Fast Fourier transform (FFT) is widely used in digital signal processing and
telecommunications, particularly in orthogonal frequency division
multiplexing systems, to overcome the problems associated with orthogonal
subcarriers. This work introduces a new radix-3 FFT algorithm.
The DFT of length N can be realized from three DFT sequences, each
of length N/3. The radix-3 algorithm reduces the number of multiplications
required to realize the DFT. A novel design of a Radix-3 pipelined Single-path
Delay Feedback (R3SDF) FFT using a modified carry select adder (MCSLA) is
proposed in this paper. First, the pipelined radix-3 SDF FFT is designed. It occupies
less area but has large power consumption and delay. To overcome these
problems, a modified carry select adder structure performs the additions,
which reduces the power consumption and delay. Finally, the
MCSLA is integrated into the radix-3 SDF FFT processor. The hardware
complexity and execution time of the radix-3 FFT algorithm can
be reduced relative to other FFTs.
1. INTRODUCTION

The discrete Fourier transform (DFT) is crucial in modern telecommunications and digital signal
processing, but it is computationally intensive. To address this problem, Cooley and
Tukey developed the fast Fourier transform (FFT), which has proved particularly valuable for
applications involving orthogonal frequency division multiplexing (OFDM), such as Worldwide
Interoperability for Microwave Access (WiMAX), long-term evolution (LTE), asymmetric digital subscriber
line (ADSL), very-high-speed DSL (VDSL), and digital audio/video broadcasting (DAB/DVB) systems.

To reduce power consumption and hardware costs, different types of FFT processors have been
developed. The memory-based architecture gives a low-power result, but it suffers from long
delay and may need extra buffer space for system synchronization. The pipelined Single-path Delay
Feedback (SDF) FFT architecture has been developed to reduce the memory required by memory-based
architectures. This approach uses N-1 delay elements, the multiplication accounts for less than
50% of the computation, and the control unit design is relatively straightforward. These features are
particularly advantageous in high-performance designs for portable digital signal processing devices.

Fixed-radix FFTs such as the radix-3 FFT are considered competitive in efficiency with the radix-2 FFT.
In this paper, a new pipelined radix-3 SDF FFT using MCSLA is designed to reduce the
number of multiplications. In this radix-3 FFT, the modified carry select adder performs the
addition operations to reduce the power consumption and to improve the performance
of the FFT processor.


2. BACKGROUND 

A novel algorithm for computing radix-3, 6, and 12 FFTs is presented in [1]. The FFT
algorithm is evaluated in an ordinary (1,j) complex plane, and the number of additions is reduced
considerably; the number of multiplications is also reduced. An efficient approach to computing the discrete
Fourier transform (DFT) using a radix-3 FFT algorithm is
described in [2]. Compared with the existing algorithm, it requires fewer multiplications. The matrix formed by the various powers
of the twiddle factor is decomposed into two matrices, and it is shown that fewer complex
multiplications are needed to compute the result than with the original Cooley-Tukey algorithm.

The hardware implementation of mixed-radix FFTs with radix-5 and radix-3 cores as well as the
standard radix-2 core is presented in [3]. The mixed-radix FFT is more costly than the radix-2
implementation: a 1200-point mixed-radix FFT needs 36 real multipliers in a
pipelined implementation, whereas a 2048-point radix-2 FFT needs 30 real multipliers. A radix-3 FFT is described in [4]
in which the elementary three-point DFTs need no multiplications. This reduces the number of
multiplications but at the same time increases the number of additions. The algorithm is advantageous
on processors that require more time for multiplication than for addition.

A novel FFT algorithm is developed in [5] together with a pipelined architecture design.
The algorithm reduces the number of complex multipliers as well as the size of the
twiddle factor ROMs, and it is shown to be suitable for large FFT VLSI implementations. FFT
architectures designed for OFDM applications are presented in [6]-[7]. A novel architecture for an efficient Fast
Fourier Transform (FFT) processor is presented in [8] to meet the requirements of high-speed wireless communication
system standards. That work develops an optimal constant-multiplication arithmetic design that multiplies a
fixed-point input by one of the several existing twiddle factor constants.


3. RADIX-3 FFT ALGORITHM 

The radix-3 FFT algorithm is used to compute the Discrete Fourier Transform (DFT). It requires fewer
multiplications than direct computation. The radix-3 FFT algorithm is based on the divide-and-conquer method:
it decomposes an N-point DFT into successively smaller DFTs, provided that the number of data points is a power
of 3 (i.e., 3^n).

The radix-3 algorithm realizes a DFT of length N = 3^n (n = 1, 2, 3, ...). The DFT of length N
can be realized from three DFT sequences, each of length N/3. If the input signal has length N, direct
calculation of the DFT needs O(N^2) complex multiplications. The radix-3 algorithm reduces the number of
multiplications, and with it the processing time and hardware complexity of implementing the DFT.

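The DFT of N points is given by

$$Z(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j 2\pi k n / N}, \quad k = 0, 1, 2, \ldots, N-1 \tag{1}$$

where $j = \sqrt{-1}$ and Z(k) is the transformed data.

Substituting $W_N^{kn} = e^{-j 2\pi k n / N}$ in (1) gives

$$Z(k) = \sum_{n=0}^{N-1} x(n)\, W_N^{kn} \tag{2}$$

$W_N$ is known as the twiddle factor.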
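
As a reference point, equation (2) can be evaluated directly. The following is a minimal Python sketch and not part of the proposed design; the function name `dft_direct` is an illustrative assumption. The two nested loops make the N^2 complex multiplications of the direct method explicit.

```python
import cmath


def dft_direct(x):
    """Evaluate Z(k) = sum_n x(n) * W_N^(k*n) directly, with W_N = exp(-j*2*pi/N)."""
    n_points = len(x)
    z = []
    for k in range(n_points):
        acc = 0j
        for n in range(n_points):
            # One complex multiplication per (k, n) pair: N*N in total.
            acc += x[n] * cmath.exp(-2j * cmath.pi * k * n / n_points)
        z.append(acc)
    return z
```

For a constant input such as `[1, 1, 1]`, all the energy falls in Z(0) and the remaining outputs are zero up to rounding error.
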
The output components Z(0), Z(1), Z(2), ..., Z(N-1) are arranged in three groups, namely
Z(3r), Z(3r+1) and Z(N-3r-1), where r = 0, 1, 2, ..., N/3-1.
The following expressions can be derived from equation (2):

$$Z(3r) = \sum_{n=0}^{N/3-1} \left[ x(n) + x\!\left(n + \tfrac{N}{3}\right) + x\!\left(n + \tfrac{2N}{3}\right) \right] W_N^{kn} \tag{3}$$

where k = 3r and r = 0, 1, 2, ..., N/3-1.

$$Z(3r+1) = \sum_{n=0}^{N/3-1} \left[ x(n) + W_N^{k(N/3)}\, x\!\left(n + \tfrac{N}{3}\right) + W_N^{k(2N/3)}\, x\!\left(n + \tfrac{2N}{3}\right) \right] W_N^{kn} \tag{4}$$

where k = 3r+1 and r = 0, 1, 2, ..., N/3-1.

$$Z(N-3r-1) = \sum_{n=0}^{N/3-1} \left[ x(n) + W_N^{k(N/3)}\, x\!\left(n + \tfrac{N}{3}\right) + W_N^{k(2N/3)}\, x\!\left(n + \tfrac{2N}{3}\right) \right] W_N^{kn} \tag{5}$$

where k = N-3r-1 and r = 0, 1, 2, ..., N/3-1.

$K = -\sin(2\pi/3)$
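
The decomposition in equations (3)-(5) can be sketched in Python as a recursive decimation-in-frequency radix-3 FFT. This is an illustrative software sketch under stated assumptions, not the hardware design of this paper: the names `butterfly3` and `fft_radix3` are hypothetical, the third output group is indexed as k = 3r+2 (the same set of indices as N-3r-1, listed in the opposite order), and the way the constant K enters the three-point butterfly is one common formulation that is assumed here.

```python
import cmath
import math

K = -math.sin(2 * math.pi / 3)  # constant from the text; W_3 = -1/2 + j*K


def butterfly3(a, b, c):
    """Three-point DFT of (a, b, c) using additions, a halving and the constant K."""
    s = b + c
    d = b - c
    t = a - 0.5 * s
    u = 1j * K * d
    return a + s, t + u, t - u


def fft_radix3(x):
    """Recursive radix-3 decimation-in-frequency FFT; len(x) must be a power of 3."""
    n_points = len(x)
    if n_points == 1:
        return list(x)
    if n_points % 3 != 0:
        raise ValueError("length must be a power of 3")
    m = n_points // 3
    y0, y1, y2 = [], [], []
    for n in range(m):
        s0, s1, s2 = butterfly3(x[n], x[n + m], x[n + 2 * m])
        w = cmath.exp(-2j * cmath.pi * n / n_points)  # twiddle factor W_N^n
        y0.append(s0)          # feeds Z(3r), equation (3)
        y1.append(s1 * w)      # feeds Z(3r+1), equation (4)
        y2.append(s2 * w * w)  # feeds Z(3r+2), same index set as equation (5)
    z = [0j] * n_points
    z[0::3] = fft_radix3(y0)
    z[1::3] = fft_radix3(y1)
    z[2::3] = fft_radix3(y2)
    return z
```

In this formulation the only general complex multiplications are the two twiddle-factor products per butterfly, which is where the saving over direct evaluation of equation (2) comes from.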


Figure 1. Flow Graph of Radix-3 FFT 


4. PROPOSED PIPELINED STRUCTURE OF RADIX-3 SDF FFT USING MCSLA 

In this paper, a new pipelined radix-3 SDF FFT structure is designed to
improve speed. The radix-3 FFT reduces the number of multiplications. The Single-path Delay
Feedback (SDF) FFT is a pipelined frequency transformation technique in which the inputs are supplied
serially, and it provides high-speed operation. However, this FFT structure incurs high delay and
power consumption [9] because it stores a large number of unwanted intermediate signals. SDF
FFT structures nevertheless have the most efficient memory utilization among pipelined FFT processors.
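
The memory efficiency of the SDF structure can be illustrated by counting delay elements. The Python sketch below rests on an assumption that the text does not state explicitly: that stage s of a radix-r SDF pipeline uses r-1 feedback delay lines, each of length N/r^s. The function name `sdf_delay_elements` is hypothetical. Under this assumption the total comes to N-1, which matches the figure quoted in the introduction.

```python
def sdf_delay_elements(n_points, radix=3):
    """Return (delay-line length per stage, total delay elements).

    Assumes each stage has (radix - 1) feedback delay lines of equal length.
    """
    if n_points < radix:
        raise ValueError("n_points must be at least the radix")
    lengths = []
    size = n_points
    while size > 1:
        if size % radix != 0:
            raise ValueError("n_points must be a power of the radix")
        size //= radix
        lengths.append(size)
    total = (radix - 1) * sum(lengths)
    return lengths, total


if __name__ == "__main__":
    # A three-stage, 27-point radix-3 pipeline under the stated assumption.
    print(sdf_delay_elements(27))
```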

Figure 2 shows the architecture of the radix-3 pipelined SDF FFT. The architecture consists of
Processing Elements (PE), delay elements and twiddle factor values. Addition and subtraction are
performed in the processing element. Initially, the real and imaginary input values are given to the
first stage. The input values are then delayed by 1, and the delayed values are given to the processing element.

In the processing element, the addition and subtraction operations are carried out. The results are then
multiplied by the twiddle factor values. Finally, the output values of the first stage are selected by a
multiplexer. The first-stage output is fed to the input of the second stage, and the same operations
are repeated in the second and third stages. The main disadvantage of the single-path delay feedback FFT
is its large power consumption. To overcome this problem, a modified carry select adder [10] is integrated
into the radix-3 SDF FFT to perform the additions efficiently and reduce the power consumption.
The modified carry select adder is obtained from the regular CSLA [11] by modifying the full adder structure
to reduce the number of gates.
Figure 2. Architecture of Radix-3 Pipelined SDF FFT
Figure 3. Structure of modified carry select adder


To add three 1-bit inputs, the full adder circuit uses 2 XOR gates, 2 AND
gates and 1 OR gate, giving a gate count of 13. The Reduced Full Adder (RFA) circuit [7] is designed with a
minimum number of logic gates: 2 AND gates, 1 OR gate, 2 NOT gates and 1 multiplexer, giving a
gate count of 9. A multiplexer (MUX) based Reduced Full Adder circuit is designed in this
paper to improve the performance of digital adder circuits. The structure of the modified CSLA
is shown in Figure 3.
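
The gate counts quoted above can be reproduced with a short calculation. The Python sketch below assumes a cost convention that the text does not state: 1 for each AND, OR and NOT gate, 5 for an XOR gate and 4 for a 2:1 multiplexer. The names `GATE_COST`, `gate_count` and `full_adder` are hypothetical. The full adder function shows where the 2 XOR, 2 AND and 1 OR gates come from; the internal wiring of the RFA is not given in the text, so only its gate inventory is modelled.

```python
# Assumed cost per gate type (not stated in the text).
GATE_COST = {"AND": 1, "OR": 1, "NOT": 1, "XOR": 5, "MUX": 4}

# Gate inventories as listed in the text.
FULL_ADDER = {"XOR": 2, "AND": 2, "OR": 1}
REDUCED_FULL_ADDER = {"AND": 2, "OR": 1, "NOT": 2, "MUX": 1}


def gate_count(circuit):
    """Weighted gate count of a circuit given as {gate type: number of gates}."""
    total = 0
    for gate, quantity in circuit.items():
        total += GATE_COST[gate] * quantity
    return total


def full_adder(a, b, cin):
    """Add three 1-bit inputs using 2 XOR, 2 AND and 1 OR operations."""
    p = a ^ b                    # XOR 1
    s = p ^ cin                  # XOR 2
    cout = (a & b) | (p & cin)   # AND 1, AND 2, OR
    return s, cout


if __name__ == "__main__":
    print(gate_count(FULL_ADDER), gate_count(REDUCED_FULL_ADDER))
```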

The modified CSLA [7] is used to reduce the power consumption and improve the
performance of the FFT processor. Compared with the regular CSLA, the modified CSLA [7] gives better
performance. Finally, the modified CSLA is integrated into the pipelined radix-3 SDF FFT.
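
For orientation, the carry-select principle itself can be sketched behaviourally in Python. This models a regular CSLA only, not the modified structure of Figure 3, whose internal details are not reproduced here. The 16-bit width, the 4-bit block size and all function names are assumptions made for illustration. Each block computes its sum for both possible carry-in values, and the incoming carry then selects one result, as a multiplexer would.

```python
def full_adder(a, b, cin):
    """1-bit full adder: returns (sum, carry-out)."""
    p = a ^ b
    return p ^ cin, (a & b) | (p & cin)


def ripple_block(a_bits, b_bits, cin):
    """Ripple-carry addition over one block of bits (least significant first)."""
    carry = cin
    sums = []
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        sums.append(s)
    return sums, carry


def carry_select_add(a, b, width=16, block=4):
    """Behavioural model of a regular carry select adder; returns (sum, carry-out)."""
    a_bits = [(a >> i) & 1 for i in range(width)]
    b_bits = [(b >> i) & 1 for i in range(width)]
    carry = 0
    result = 0
    for lo in range(0, width, block):
        seg_a = a_bits[lo:lo + block]
        seg_b = b_bits[lo:lo + block]
        # Both candidates are computed without waiting for the incoming carry.
        sums0, cout0 = ripple_block(seg_a, seg_b, 0)
        sums1, cout1 = ripple_block(seg_a, seg_b, 1)
        # Multiplexer: the incoming carry selects one candidate.
        if carry:
            sums, carry = sums1, cout1
        else:
            sums, carry = sums0, cout0
        for i, bit in enumerate(sums):
            result |= bit << (lo + i)
    return result, carry
```

Duplicating the per-block adders costs hardware, and that duplicated logic is what a reduced-gate adder cell is intended to shrink.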


5. RESULTS AND DISCUSSION 

The Radix-3 Single-path Delay Feedback (R3SDF) FFT using the Modified Carry Select Adder (MCSLA)
has been developed in Verilog Hardware Description Language (Verilog HDL). The simulation
and synthesis results were obtained using ModelSim 6.3c and the Xilinx 10.1i design
tool. The simulation result of the proposed radix-3 pipelined SDF FFT using MCSLA is shown in Figure 4.
A comparison of the radix-3 SDF FFT and the radix-3 SDF FFT using MCSLA is shown in Figure 5.
Figure 4. Simulation Result of Proposed Pipelined Radix-3 SDF FFT using Modified Carry Select Adder


Table 1. Comparison Analysis of Pipelined R3SDF FFT and Pipelined R3SDF FFT
using Modified Carry Select Adder

Types                                                         Slices   LUTs    Delay (ns)   Power Consumption (mW)
Pipelined Radix-3 SDF FFT                                     170      318     15.173       475
Pipelined Radix-3 SDF FFT using Modified Carry Select Adder   100      180     10.507       118
% Reduction                                                   41.17    43.39   30.75        75.15





Table 1 shows that the number of slices is 170 and 100, the number of LUTs is 318 and 180, the
delay is 15.173 ns and 10.507 ns, and the power consumption is 475 mW and 118 mW for the pipelined R3SDF FFT
and the R3SDF FFT using the Modified Carry Select Adder, respectively.
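
The reduction percentages follow from the slice, LUT, delay and power values above as (baseline - improved) / baseline x 100. A minimal Python sketch of this arithmetic is given below; the names `percent_reduction` and `TABLE_1` are illustrative assumptions.

```python
def percent_reduction(baseline, improved):
    """Relative reduction of `improved` with respect to `baseline`, in percent."""
    return (baseline - improved) / baseline * 100.0


# (R3SDF FFT, R3SDF FFT using MCSLA) pairs taken from Table 1.
TABLE_1 = {
    "slices": (170, 100),
    "luts": (318, 180),
    "delay_ns": (15.173, 10.507),
    "power_mw": (475, 118),
}

reductions = {}
for name, pair in TABLE_1.items():
    reductions[name] = percent_reduction(pair[0], pair[1])

if __name__ == "__main__":
    for name, value in reductions.items():
        print(name, value)
```

The computed values agree with the reported reduction percentages when truncated to two decimal places.
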
Figure 5. Performance Evaluation of pipelined Radix-3 SDF FFT and Radix-3 SDF FFT using MCSLA


6. CONCLUSION 
In this paper, a novel design of a pipelined Radix-3 Single-path Delay Feedback (R3SDF) FFT using a
modified CSLA has been proposed. The proposed radix-3 FFT requires fewer multiplications
than other FFTs, and the algorithm is competitive in speed with the radix-2 FFT. In the pipelined radix-3 SDF FFT, the
area is reduced but the delay and power consumption are increased. To overcome this
problem, a modified carry select adder is used to perform the additions in the radix-3 SDF FFT.
The main aim of this paper is to reduce area, latency and power consumption. The proposed method
offers a 41.17% reduction in occupied slices, a 43.39% reduction in LUTs, a 30.75% reduction in delay and a
75.15% reduction in power consumption compared with the radix-3 SDF FFT.